AITopics | step training step training step

Collaborating Authors

step training step training step

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Understanding and Minimising Outlier Features in Neural Network Training

He, Bobby, Noci, Lorenzo, Paliotta, Daniele, Schlag, Imanol, Hofmann, Thomas

arXiv.org Artificial IntelligenceMay-29-2024

Outlier Features (OF) are neurons whose activation magnitudes significantly exceed the average over a neural network's (NN) width. They are well known to emerge during standard transformer training and have the undesirable effect of hindering quantisation in afflicted models. Despite their practical importance, little is known behind why OFs emerge during training, nor how one can minimise them. Our work focuses on the above questions, first identifying several quantitative metrics, such as the kurtosis over neuron activation norms, to measure OFs. With these metrics, we study how architectural and optimisation choices influence OFs, and provide practical insights to minimise OFs during training. As highlights, we emphasise the importance of controlling signal propagation throughout training, and propose the Outlier Protected transformer block, which removes standard Pre-Norm layers to mitigate OFs, without loss of convergence speed or training stability. Overall, our findings shed new light on our understanding of, our ability to prevent, and the complexity of this important facet in NN training dynamics.

international conference, kurtosis, ofs, (10 more...)

arXiv.org Artificial Intelligence

2405.19279

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > Dominican Republic (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

On Training Data Influence of GPT Models

Liu, Qingyi, Chai, Yekun, Wang, Shuohuan, Sun, Yu, Peng, Qiwei, Wang, Keze, Wu, Hua

arXiv.org Artificial IntelligenceApr-16-2024

Amidst the rapid advancements in generative language models, the investigation of how training data shapes the performance of GPT models is still emerging. This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models. Our approach not only traces the influence of individual training instances on performance trajectories, such as loss and other key metrics, on targeted test points but also enables a comprehensive comparison with existing methods across various training scenarios in GPT models, ranging from 14 million to 2.8 billion parameters, across a range of downstream tasks. Contrary to earlier methods that struggle with generalization to new data, GPTfluence introduces a parameterized simulation of training dynamics, demonstrating robust generalization capabilities to unseen training data. This adaptability is evident across both fine-tuning and instruction-tuning scenarios, spanning tasks in natural language understanding and generation. We will make our code and data publicly available.

simulation, step training step training step, training data, (14 more...)

arXiv.org Artificial Intelligence

2404.0784

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Education (0.48)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning

Zhao, Mingde, Alver, Safa, van Seijen, Harm, Laroche, Romain, Precup, Doina, Bengio, Yoshua

arXiv.org Artificial IntelligenceFeb-4-2024

Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning agent utilizing spatio-temporal abstractions to generalize learned skills in novel situations. It automatically decomposes the given task into smaller, more manageable subtasks, and hence enables sparse decision-making and focused computation on the relevant parts of the environment. This relies on the extraction of an abstracted proxy problem represented as a directed graph, in which vertices and edges are learned end-to-end from hindsight. Our theoretical analyses provide performance guarantees under appropriate assumptions and establish where our approach is expected to be helpful. Generalization-focused experiments validate Skipper's significant advantage in zero-shot generalization, compared to existing state-of-the-art hierarchical planning methods.

agent, checkpoint, diff 0, (13 more...)

arXiv.org Artificial Intelligence

2310.00229

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Goal-Conditioned Q-Learning as Knowledge Distillation

Levine, Alexander, Feizi, Soheil

arXiv.org Artificial IntelligenceMar-8-2023

Many applications of reinforcement learning can be formalized as goal-conditioned environments, where, in each episode, there is a "goal" that affects the rewards obtained during that episode but does not affect the dynamics. Various techniques have been proposed to improve performance in goal-conditioned environments, such as automatic curriculum generation and goal relabeling. In this work, we explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation. In particular: the current Q-value function and the target Q-value estimate are both functions of the goal, and we would like to train the Q-value function to match its target for all goals. We therefore apply Gradient-Based Attention Transfer (Zagoruyko and Komodakis 2017), a knowledge distillation technique, to the Q-function update. We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional. We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals, where the agent can attain a reward by achieving any one of a large set of objectives, all specified at test time. Finally, to provide theoretical support, we give examples of classes of environments where (under some assumptions) standard off-policy algorithms such as DDPG require at least O(d^2) replay buffer transitions to learn an optimal policy, while our proposed technique requires only O(d) transitions, where d is the dimensionality of the goal and state space. Code is available at https://github.com/alevine0/ReenGAGE.

machine learning, reengage, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2208.13298

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > Puerto Rico > San Juan > San Juan (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Decentralized Motion Planning for Multi-Robot Navigation using Deep Reinforcement Learning

Kandhasamy, Sivanathan, Kuppusamy, Vinayagam Babu, Samak, Tanmay Vilas, Samak, Chinmay Vilas

arXiv.org Artificial IntelligenceNov-11-2020

This work presents a decentralized motion planning framework for addressing the task of multi-robot navigation using deep reinforcement learning. A custom simulator was developed in order to experimentally investigate the navigation problem of 4 cooperative non-holonomic robots sharing limited state information with each other in 3 different settings. The notion of decentralized motion planning with common and shared policy learning was adopted, which allowed robust training and testing of this approach in a stochastic environment since the agents were mutually independent and exhibited asynchronous motion behavior. The task was further aggravated by providing the agents with a sparse observation space and requiring them to generate continuous action commands so as to efficiently, yet safely navigate to their respective goal locations, while avoiding collisions with other dynamic peers and static obstacles at all times. The experimental results are reported in terms of quantitative measures and qualitative remarks for both training and deployment phases.

agent, motion planning, robot, (14 more...)

arXiv.org Artificial Intelligence

2011.05605

Country: Asia > India > Tamil Nadu (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.93)

Add feedback